CMRULES: An Efficient Algorithm for Mining Sequential Rules Common to Several Sequences
Abstract
We propose CMRULES, an algorithm for mining sequential rules that are common to many sequences in a sequence database, rather than rules that appear frequently within sequences. For this reason, the algorithm does not use a sliding-window approach. Instead, it first mines association rules, which prunes the search space to items that occur jointly in many sequences. It then eliminates the association rules that do not meet the minimum support and confidence thresholds once the time ordering is taken into account. We evaluate the performance of CMRULES in three ways. First, we analyze its time complexity. Second, we compare its performance on a public dataset with a variation of an algorithm from the literature. The results show that CMRULES is more efficient at low support thresholds and scales better. Lastly, we report a real application of the algorithm in a complex system.
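The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: a sequence is modeled as a time-ordered list of itemsets; a rule X ⇒ Y is taken to occur in a sequence when every item of X appears strictly before every item of Y; sequential support is the fraction of sequences in which the rule occurs; and sequential confidence divides that count by the number of sequences containing X. The association-rule phase is a brute-force enumeration standing in for a proper frequent-itemset miner, and the names `rule_occurs`, `mine_common_rules`, and `max_side` are hypothetical.

```python
from itertools import combinations


def rule_occurs(sequence, antecedent, consequent):
    """True if all antecedent items appear strictly before all consequent items.

    `sequence` is a time-ordered list of itemsets (sets of items).
    This occurrence test is an assumed definition, not taken from the abstract.
    """
    seen = set()
    for k in range(len(sequence) - 1):
        seen |= sequence[k]
        if antecedent <= seen:
            # The earliest split point leaves the largest possible suffix,
            # so checking it once is enough.
            later = set().union(*sequence[k + 1:])
            return consequent <= later
    return False


def mine_common_rules(database, min_seq_sup, min_seq_conf, max_side=1):
    """Brute-force sketch: association rules first, then a time-order filter."""
    n = len(database)
    # Phase 1: drop the ordering, so each sequence becomes one transaction.
    transactions = [set().union(*seq) for seq in database]
    items = sorted(set().union(*transactions))
    # Candidate rule sides: itemsets with at most `max_side` items (assumption).
    sides = [frozenset(c)
             for size in range(1, max_side + 1)
             for c in combinations(items, size)]
    rules = []
    for x in sides:
        sup_x = sum(x <= t for t in transactions)
        for y in sides:
            if x & y:
                continue  # antecedent and consequent must be disjoint
            joint = sum((x | y) <= t for t in transactions)
            # Unordered support and confidence are upper bounds on the
            # sequential ones, so failing here prunes the candidate early.
            if joint == 0 or joint / n < min_seq_sup or joint / sup_x < min_seq_conf:
                continue
            # Phase 2: recount, keeping only sequences that respect time order.
            seq_count = sum(rule_occurs(seq, x, y) for seq in database)
            seq_sup = seq_count / n
            seq_conf = seq_count / sup_x
            if seq_sup >= min_seq_sup and seq_conf >= min_seq_conf:
                rules.append((tuple(sorted(x)), tuple(sorted(y)), seq_sup, seq_conf))
    return rules
```

Because the unordered counts can only overestimate the ordered ones, the first phase never discards a rule that the second phase would keep; it only reduces how many candidates need the more expensive ordered recount.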
Related resources
CMRULES: An Efficient Algorithm for Mining Sequential Rules
We propose CMRULES, an algorithm for mining sequential rules common to many sequences in sequence databases, not for mining rules appearing frequently in sequences. For this reason, the algorithm does not use a sliding-window approach. Instead, it first finds association rules to prune the search space for items that occur jointly in many sequences. Then it eliminates association rules that do n...
CMRules: Mining sequential rules common to several sequences
Sequential rule mining is an important data mining task used in a wide range of applications. However, current algorithms for discovering sequential rules common to several sequences use very restrictive definitions of sequential rules, which make them unable to recognize that similar rules can describe the same phenomenon. This can have many undesirable effects such as (1) similar rules that are...
CMRules: Mining Sequential Rules
Sequential rule mining is an important data mining task with wide applications. However, current algorithms for discovering sequential rules common to several sequences use very restrictive definitions of sequential rules, which make them unable to recognize that similar rules can describe the same phenomenon. This can have many undesirable effects such as (1) similar rules that are rated differe...
Mining Sequential Rules Common to Several Sequences
We present an algorithm for mining sequential rules common to several sequences, such that rules have to appear within a maximum time span. Experimental results with real-life datasets show that the algorithm can reduce the execution time, memory usage and the number of rules generated by several orders of magnitude compared to previous algorithms.
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach for evaluating the performance of a set of peer entities, called decision-making units (DMUs), that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...